Abstract

We discuss the application of random projections to the fundamental problem of deciding 
whether a given point in a Euclidean space belongs to a given set. We show that, under a number 
of different assumptions, the feasibility and infeasibility of this problem are preserved with high 
probability when the problem data is projected to a lower dimensional space. Our results are 
applicable to any algorithmic setting which needs to solve Euclidean membership problems in a 
high-dimensional space. 


1 Introduction 


Random projections are useful dimension reduction techniques that are widely used in computer
science [?]. Suppose we have an algorithm $\mathcal{A}$ acting on a data set $X$ consisting of $n$
vectors in $\mathbb{R}^m$, where $m$ is large, and assume that the complexity of $\mathcal{A}$ depends
on $m$ and $n$ in a way that makes it impossible to run $\mathcal{A}$ sufficiently fast. A random
projection exploits the statistical properties of some random distribution to construct a mapping
which embeds $X$ into a lower dimensional space $\mathbb{R}^k$ (for some appropriately chosen $k$)
while preserving distances, angles, or other quantities used by $\mathcal{A}$.

One striking example of random projections is the famous Johnson-Lindenstrauss lemma [9]: 

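1.1 Theorem (Johnson-Lindenstrauss Lemma)

Let $X$ be a set of $n$ points in $\mathbb{R}^m$ and $\varepsilon > 0$. Then there is a map
$F : \mathbb{R}^m \to \mathbb{R}^k$, where $k$ is $O(\varepsilon^{-2} \log n)$, such that for any
$x, y \in X$ we have

$$(1 - \varepsilon)\|x - y\|_2^2 \le \|F(x) - F(y)\|_2^2 \le (1 + \varepsilon)\|x - y\|_2^2. \tag{1}$$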

Intuitively, this lemma claims that $X$ can be projected into a much lower dimensional space whilst
keeping Euclidean distances approximately the same. The main idea to prove Thm. 1.1 is to construct
a random linear mapping $T$ (called a JL random mapping from now on), sampled from certain
distribution families, so that for each $x \in \mathbb{R}^m$, the event that

$$(1 - \varepsilon)\|x\|_2^2 \le \|T(x)\|_2^2 \le (1 + \varepsilon)\|x\|_2^2 \tag{2}$$

occurs with high probability. By Eq. (2) and the union bound, it is possible to show the existence of
a map $F$ with the stated properties (see [?]).
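
The norm-preservation event in Eq. (2) can be observed empirically. The following is a minimal
sketch, not the construction used in the proofs: it assumes a Gaussian matrix whose entries are
scaled by $1/\sqrt{k}$ (a common normalization, chosen here so that the ratio concentrates around 1),
and the helper names `gaussian_projection`, `apply` and `distortion` are hypothetical.

```python
import math
import random

def gaussian_projection(m, k, rng):
    """Return a k x m matrix with i.i.d. N(0, 1/k) entries (assumed normalization)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(m)] for _ in range(k)]

def apply(T, x):
    """Matrix-vector product T x on plain Python lists."""
    return [sum(t_ij * x_j for t_ij, x_j in zip(row, x)) for row in T]

def sq_norm(v):
    """Squared Euclidean norm."""
    return sum(c * c for c in v)

def distortion(T, x):
    """Ratio ||T x||^2 / ||x||^2, which Eq. (2) asks to lie in [1 - eps, 1 + eps]."""
    return sq_norm(apply(T, x)) / sq_norm(x)

rng = random.Random(0)          # fixed seed so the run is reproducible
m, k = 300, 200                 # illustrative dimensions, not taken from the paper
T = gaussian_projection(m, k, rng)
points = [[rng.uniform(-1.0, 1.0) for _ in range(m)] for _ in range(5)]
ratios = [distortion(T, x) for x in points]
print(ratios)
```

Each ratio is distributed as a Chi-squared variable with $k$ degrees of freedom divided by $k$, so
for $k = 200$ it has standard deviation $0.1$ around 1.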

In this paper we employ random projections to study the following general problem: 

Euclidean Set Membership Problem (ESMP). Given $p \in \mathbb{R}^m$ and $X \subseteq \mathbb{R}^m$,
decide whether $p \in X$.

This is a fundamental class consisting of many problems, both in P (e.g. the Linear Feasibility
Problem (LFP)) and NP-hard (e.g. the Integer Feasibility Problem (IFP), which can
naturally model SAT; see also [?]).

In this paper, we use a random linear projection operator T to embed both p and X to a lower 
dimensional space, and study the relationship between the original membership problem and its 
projected version: 

Projected ESMP (PESMP). Given $p, X, T$ as above, decide whether $T(p) \in T(X)$.

Note that, when $p \in X$, the fact that $T(p) \in T(X)$ follows by linearity of $T$. We are therefore
only interested in the case when $p \notin X$, i.e. we want to estimate
$\mathrm{Prob}(T(p) \notin T(X))$, given that $p \notin X$.


1.1 Previous results 


Random projections applied to some special cases of membership problems have been studied in [11],
where we exploited the polyhedral structure of the problem to derive several results for polytopes
and polyhedral cones. In the case where $X$ is a polytope, we obtained the following result.

1.2 Proposition ([11])

Given $a_1, \dots, a_n \in \mathbb{R}^m$, let $C = \mathrm{conv}\{a_1, \dots, a_n\}$, $b \in \mathbb{R}^m$
such that $b \notin C$, $d = \min_{x \in C} \|b - x\|$ and $D = \max_{1 \le i \le n} \|b - a_i\|$. Let
$T : \mathbb{R}^m \to \mathbb{R}^k$ be a JL random mapping. Then

$$\mathrm{Prob}(T(b) \notin T(C)) \ge 1 - [\ldots]$$

for some constant $\mathcal{C}$ (independent of $m, n, k, d, D$) and $\varepsilon < [\ldots]$.


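If $X$ is a polyhedral cone, we obtained the following result.

1.3 Proposition ([11])

Given $b, a_1, \dots, a_n \in \mathbb{R}^m$ of norm 1 such that
$b \notin C = \mathrm{cone}\{a_1, \dots, a_n\}$, let $d = \min_{x \in C} \|b - x\|$ and let
$T : \mathbb{R}^m \to \mathbb{R}^k$ be a JL random mapping. Then:

$$\mathrm{Prob}(T(b) \notin T(C)) \ge 1 - 2n(n + [\ldots]$$

for some constant $\mathcal{C}$ (independent of $m, n, k, d$), where $\varepsilon = [\ldots]$,

$$\mu_A = \max\{\|x\|_A \mid x \in \mathrm{cone}(a_1, \dots, a_n) \wedge \|x\| \le 1\},$$

and $\|x\|_A = \min\{\sum_i \theta_i \mid \theta \ge 0 \wedge x = \sum_i \theta_i a_i\}$ is the norm
induced by $A = (a_1, \dots, a_n)$.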
We also recall the following lemma, useful for the integer case.

1.4 Lemma ([11])

Let $T : \mathbb{R}^m \to \mathbb{R}^k$ be a JL random mapping, let
$b, a_1, \dots, a_n \in \mathbb{R}^m$ and let $X \subseteq \mathbb{R}^n$ be a finite set.
Then if $b \ne \sum_{i=1}^n y_i a_i$ for all $y \in X$, we have

$$\mathrm{Prob}\Big(\forall y \in X:\; T(b) \ne \sum_{i=1}^n y_i T(a_i)\Big) \ge 1 - 2|X| e^{-\mathcal{C}k}$$

for some constant $\mathcal{C} > 0$ (independent of $m, k$).

1.2 New results 

In this paper, we consider the general case where the data set X has no specific structure, and use 
Gaussian random projections in our arguments to obtain some results about the relationship between 
ESMP and PESMP. 

In the case when $X$ is at most countable (i.e. finite or countably infinite), using a
straightforward argument, we prove that these two problems are equivalent almost surely. However,
this result is only of theoretical interest due to round-off errors in floating point operations,
which make its practical application difficult. We address this issue by introducing a threshold
$\delta > 0$ with a corresponding Threshold ESMP (TESMP): if $\Delta$ is the distance between $T(p)$
and the closest point of $T(X)$, decide whether $\Delta \ge \delta$.

In the case when $X$ may also be uncountable, we employ the doubling constant of $X$, i.e. the
smallest number $\lambda_X$ such that any closed ball in $X$ can be covered by at most $\lambda_X$
closed balls of half the radius. Its logarithm $\log_2 \lambda_X$ is called the doubling dimension
of $X$. Recently, the doubling dimension has become a powerful tool for several classes of problems
such as nearest neighbor [?], low-distortion embeddings [3], and clustering [?].

We show that we can project $X$ into $\mathbb{R}^k$, where $k = O(\log_2 \lambda_X)$, whilst still
ensuring the equivalence between ESMP and PESMP with high probability. We also extend this result to
the threshold case, and obtain a more useful bound for $k$.


2 Finite and countable sets 


In this section, we assume that $X$ is either finite or countable. Let $T$ be a JL random mapping
from a Gaussian distribution, i.e. each entry of $T$ is independently sampled from
$\mathcal{N}(0,1)$. It is well known that, for an arbitrary unit vector $a \in \mathbb{R}^m$, the
random variable $\|Ta\|^2$ has a Chi-squared distribution $\chi^2_k$ with $k$ degrees of freedom
([?]). Its corresponding density function is
$\frac{2^{-k/2}}{\Gamma(k/2)} x^{k/2-1} e^{-x/2}$, where $\Gamma(\cdot)$ is the gamma function.
By [3], for any $0 < \delta < 1$, taking $z = \delta/k$ yields a cumulative distribution function

$$F_{\chi^2_k}(\delta) \le (z e^{1-z})^{k/2} < (ze)^{k/2} = \Big(\frac{e\delta}{k}\Big)^{k/2}. \tag{3}$$

Thus, since $e/k < 3$ for every $k \ge 1$, we have

$$\mathrm{Prob}(\|Ta\| \le \delta) = F_{\chi^2_k}(\delta^2) < (3\delta^2)^{k/2} \tag{4}$$

or, more simply, $\mathrm{Prob}(\|Ta\| \le \delta) < \delta^k$ when $k \ge 3$.
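
The chain of bounds in (3) and (4) can be checked numerically. The sketch below is only an
illustration under a simplifying assumption: it restricts to an even number $k$ of degrees of
freedom, for which the Chi-squared CDF has the closed series form
$F(x) = e^{-x/2} \sum_{j \ge k/2} (x/2)^j / j!$. The function names are hypothetical.

```python
import math

def chi2_cdf_even(x, k, terms=80):
    """CDF of a Chi-squared variable with an even number k of degrees of freedom.

    Uses F(x) = exp(-x/2) * sum_{j >= k/2} (x/2)^j / j!, truncated after `terms` terms.
    """
    if k <= 0 or k % 2 != 0:
        raise ValueError("this sketch only handles even k > 0")
    half = x / 2.0
    r = k // 2
    term = half ** r / math.factorial(r)   # first term of the tail series
    total = 0.0
    for j in range(r, r + terms):
        total += term
        term *= half / (j + 1)             # next term (x/2)^(j+1) / (j+1)!
    return math.exp(-half) * total

def bound_eq3(x, k):
    """Right-hand side of (3) evaluated at x: (e * x / k)^(k/2)."""
    return (math.e * x / k) ** (k / 2.0)

def check(delta, k):
    """Return (exact probability, bound from (3), bound from (4), delta^k)."""
    x = delta * delta                      # Prob(||Ta|| <= delta) = F(delta^2)
    return chi2_cdf_even(x, k), bound_eq3(x, k), (3.0 * x) ** (k / 2.0), delta ** k

for k in (4, 6, 10):
    for delta in (0.1, 0.5, 0.9):
        print(k, delta, check(delta, k))
```

For every pair printed, the exact probability is below the bound from (3), which in turn is below
both $(3\delta^2)^{k/2}$ and $\delta^k$.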

Using this estimation, we immediately obtain the following result. 

2.1 Proposition 

Let $p \in \mathbb{R}^m$ and let $X \subseteq \mathbb{R}^m$ be at most countable, with $p \notin X$.
Then, for a Gaussian random projection $T : \mathbb{R}^m \to \mathbb{R}^k$ with any $k \ge 1$, we have
$T(p) \notin T(X)$ almost surely, i.e. $\mathrm{Prob}(T(p) \notin T(X)) = 1$.

Proof. First, note that for any $u \ne 0$, $Tu \ne 0$ holds almost surely. Indeed, without loss of
generality we can assume that $\|u\| = 1$. Then for any $0 < \delta < 1$:

$$\mathrm{Prob}(T(u) = 0) \le \mathrm{Prob}(\|Tu\| \le \delta) < (3\delta^2)^{k/2} \to 0 \quad \text{as } \delta \to 0.$$

Since the event $T(p) \notin T(X)$ can be written as the intersection of at most countably many almost
sure events $T(p) \ne T(x)$ (for $x \in X$), it follows that $\mathrm{Prob}(T(p) \notin T(X)) = 1$,
as claimed. □


Proposition 2.1 is simple, but it is interesting because it suggests that we only need to project
the data points to a line (i.e. $k = 1$) and study an equivalent membership problem on a line.
Furthermore, it turns out that this result remains true for a large class of random projections.
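
The projection to a line can be sketched on a toy instance. This is only an illustration with a
small hypothetical integer data set; it compares floating point values with exact equality, which
is precisely the operation whose reliability is questioned later in this section.

```python
import random

def project_to_line(t, x):
    """Apply the one-row projection T (stored as the list t) to the vector x."""
    return sum(t_i * x_i for t_i, x_i in zip(t, x))

def projected_membership(p, X, rng):
    """Return True if T(p) equals T(x) for some x in X, for a fresh Gaussian T with k = 1."""
    t = [rng.gauss(0.0, 1.0) for _ in range(len(p))]
    tp = project_to_line(t, p)
    return any(project_to_line(t, x) == tp for x in X)

rng = random.Random(1)                      # fixed seed for reproducibility
X = [(1, 0, 2), (0, 3, 1), (4, 4, 4), (2, 2, 0)]   # hypothetical finite set
p_in = (4, 4, 4)                            # belongs to X
p_out = (1, 1, 1)                           # does not belong to X
print(projected_membership(p_in, X, rng), projected_membership(p_out, X, rng))
```

On such a tiny, well separated instance the projected answers agree with the original ones; the
numerical experiments discussed below show that this is not the case in general.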

2.2 Proposition 

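Let $\nu$ be a probability distribution on $\mathbb{R}^{mk}$ with bounded Lebesgue density $f$. Let
$Y \subseteq \mathbb{R}^m$ be an at most countable set such that $0 \notin Y$. Then, for a random
projection $T : \mathbb{R}^m \to \mathbb{R}^k$ sampled from $\nu$, we have $0 \notin T(Y)$ almost
surely, i.e. $\mathrm{Prob}(0 \notin T(Y)) = 1$.

Proof. For any $0 \ne y \in Y$, consider the set
$\mathcal{E}_y = \{T : \mathbb{R}^m \to \mathbb{R}^k \mid T(y) = 0\}$. If we regard each
$T : \mathbb{R}^m \to \mathbb{R}^k$ as a vector $t \in \mathbb{R}^{mk}$, then $\mathcal{E}_y$ is a
proper linear subspace of $\mathbb{R}^{mk}$ (for $k = 1$ it is the hyperplane
$\{t \in \mathbb{R}^m \mid y \cdot t = 0\}$), so it has Lebesgue measure zero and we have

$$\mathrm{Prob}(T(y) = 0) = \nu(\mathcal{E}_y) = \int_{\mathcal{E}_y} f \, d\mu \le \|f\|_\infty \int_{\mathcal{E}_y} d\mu = 0$$

where $\mu$ denotes the Lebesgue measure on $\mathbb{R}^{mk}$. The proof then follows by the
countability of $Y$, similarly to Proposition 2.1. □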


Proposition 2.2 is based on the observation that the degree $[\mathbb{R} : \mathbb{Q}]$ of the field
extension $\mathbb{R}/\mathbb{Q}$ is $2^{\aleph_0}$, whereas $Y$ is countable; so the probability
that any row vector $T_i$ of the random projection matrix $T$ will yield a linear dependence
relation $\sum_j T_{ij} y_j = 0$ for some $0 \ne y \in Y$ is zero. In practice, however, $Y$ is part
of the rational input of a decision problem, and the components of $T$ are rational: hence any
subsequence of them is trivially linearly dependent over $\mathbb{Q}$. Moreover, floating point
numbers have a bounded binary representation: hence, even if $Y$ is finite, there is a nonzero
probability that any subsequence of components of $T$ will be linearly dependent by means of a
nonzero multiplier vector in $Y$.

This idea, however, does not work in practice: we tested it by considering the ESMP given by
the IFP defined on the set $\{x \in \mathbb{Z}^n \cap [L, U] \mid Ax = b\}$. Numerical experiments
indicate that the corresponding PESMP $\{x \in \mathbb{Z}^n \cap [L, U] \mid T(A)x = T(b)\}$, with
$T$ consisting of a one-row Gaussian projection matrix, is always feasible despite the infeasibility
of the original IFP. Since Prop. 2.1 assumes that the components of $T$ are real numbers, we think
that the reason behind this failure is the round-off error associated with the floating point
representation used in computers. Specifically, when $T(A)x$ is too close to $T(b)$, floating point
operations will consider them as a single point. In order to address this issue, we force the
projected problems to obey stricter requirements. In particular, instead of only requiring that
$T(p) \notin T(X)$, we ensure that

$$\mathrm{dist}(T(p), T(X)) = \min_{x \in X} \|T(p) - T(x)\| > \tau,$$

where dist denotes the Euclidean distance, and $\tau > 0$ is a (small) given constant. With this
restriction, we obtain the following result.

2.3 Proposition 

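Given $\tau, \delta > 0$ and $p \notin X \subseteq \mathbb{R}^m$, where $X$ is a finite set, let

$$d = \min_{x \in X} \|p - x\| > 0.$$

Let $T : \mathbb{R}^m \to \mathbb{R}^k$ be a Gaussian random projection with
$k \ge \frac{\log(|X|/\delta)}{\log(d/\tau)}$. Then:

$$\mathrm{Prob}\Big(\min_{x \in X} \|T(p) - T(x)\| > \tau\Big) > 1 - \delta.$$

Proof. We assume that $k \ge 3$. For any $x \in X$ we have:

$$\mathrm{Prob}(\|T(p - x)\| \le \tau) = \mathrm{Prob}\Big(\Big\|T\Big(\frac{p - x}{\|p - x\|}\Big)\Big\| \le \frac{\tau}{\|p - x\|}\Big) \le \mathrm{Prob}\Big(\Big\|T\Big(\frac{p - x}{\|p - x\|}\Big)\Big\| \le \frac{\tau}{d}\Big) < \frac{\tau^k}{d^k},$$

due to (4), applied with threshold $\tau/d$ (which is assumed to be smaller than 1). Therefore, by
the union bound,

$$\mathrm{Prob}\Big(\min_{x \in X} \|T(p) - T(x)\| > \tau\Big) = 1 - \mathrm{Prob}\Big(\min_{x \in X} \|T(p) - T(x)\| \le \tau\Big) \ge 1 - \sum_{x \in X} \mathrm{Prob}(\|T(p) - T(x)\| \le \tau) > 1 - |X| \frac{\tau^k}{d^k}.$$

The RHS is greater than or equal to $1 - \delta$ if and only if
$\big(\frac{d}{\tau}\big)^k \ge \frac{|X|}{\delta}$, which is equivalent to
$k \ge \frac{\log(|X|/\delta)}{\log(d/\tau)}$, as claimed. □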


Note that $d$ is often unknown and can be arbitrarily small. However, if both $p$ and $X$ are
integral, then $d \ge 1$ and we can select $k \ge \frac{\log(|X|/\delta)}{\log(1/\tau)}$ in the above
proposition.
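
The dimension required by Prop. 2.3 is easy to evaluate. The following sketch computes the smallest
admissible integer $k$ together with the union-bound estimate $|X| (\tau/d)^k$ from the proof; the
function names and the sample values are hypothetical, and the lower limit $k \ge 3$ mirrors the
assumption made at the start of the proof.

```python
import math

def required_dimension(set_size, delta, d, tau):
    """Smallest integer k >= 3 with k >= log(|X| / delta) / log(d / tau)."""
    if not (0.0 < tau < d):
        raise ValueError("the bound needs 0 < tau < d")
    if not (0.0 < delta < 1.0):
        raise ValueError("delta must lie in (0, 1)")
    k = math.ceil(math.log(set_size / delta) / math.log(d / tau))
    return max(3, k)   # the proof of Prop. 2.3 assumes k >= 3

def failure_bound(set_size, k, d, tau):
    """Union-bound estimate |X| * (tau / d)^k used in the proof."""
    return set_size * (tau / d) ** k

# Integral data (so d >= 1), 1000 points, failure probability 5%, threshold 0.5.
k = required_dimension(1000, 0.05, 1.0, 0.5)
print(k, failure_bound(1000, k, 1.0, 0.5))
```

With these sample values the projected dimension does not depend on the ambient dimension $m$ at
all, only on $|X|$, $\delta$ and the ratio $d/\tau$.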


In many cases, the set $X$ is infinite. We show that when this is the case, we can still overcome
this difficulty under some assumptions. In particular, we prove that if
$X = \{Ax \mid x \in \mathbb{Z}^n_+\}$ where $A$ is an $m \times n$ matrix with integer coefficients
which are all positive in at least one row, then for any bounded vector $b \in \mathbb{Z}^m$ the
problem $b \in X$ is equivalent, with high probability, to its projection to an
$O(\log n)$-dimensional space. The idea is to separate one positive row and apply random projection
to the others.


















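Formally, let us denote by $a^i$ the $i$-th row and by $a_j$ the $j$-th column of $A$. Assume that
all entries in the row $a^i$ are positive and all entries of $b$ are bounded by a constant $B > 0$.
Remove the row $i$ from $A$ and $b$ to obtain
$A' = (a'_1, \dots, a'_n) \in \mathbb{Z}^{(m-1) \times n}$ and $b' \in \mathbb{Z}^{m-1}$. Let
$T : \mathbb{R}^{m-1} \to \mathbb{R}^k$ be a JL random mapping and denote by
$Z = \{x \in \mathbb{Z}^n_+ \mid a^i \cdot x = b_i\}$. Then we have:

2.4 Proposition

Assume that $b \notin X$, and let $0 < \delta < 1$. Using the terminology and given the assumptions
above, if $k \ge \frac{1}{\mathcal{C}} \ln\big(\frac{2}{\delta}\big) + \frac{B}{\mathcal{C}} \log(n + B - 1)$ we have

$$\mathrm{Prob}\Big(T(b') \ne \sum_{j=1}^n x_j T(a'_j) \text{ for all } x \in Z\Big) \ge 1 - \delta$$

for some constant $\mathcal{C} > 0$.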


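Proof. We first show that $|Z| \le (n + B - 1)^B$. Since all the entries of the row $a^i$ are
positive integers, we have

$$|Z| \le \Big|\Big\{x \in \mathbb{Z}^n_+ \;\Big|\; \sum_{j=1}^n x_j = b_i\Big\}\Big| \le \Big|\Big\{x \in \mathbb{Z}^n_+ \;\Big|\; \sum_{j=1}^n x_j = B\Big\}\Big|.$$

The number of elements in the RHS corresponds to the number of combinations with repetitions of
$B$ items sampled from $n$, which is equal to $\binom{n + B - 1}{B} \le (n + B - 1)^B$.

Next, by Lemma 1.4 we have:

$$\mathrm{Prob}\Big(T(b') \ne \sum_{j=1}^n x_j T(a'_j) \text{ for all } x \in Z\Big) \ge 1 - 2(n + B - 1)^B e^{-\mathcal{C}k}, \tag{5}$$

which is at least $1 - \delta$ when taking any $k$ such that
$k \ge \frac{1}{\mathcal{C}} \ln\big(\frac{2}{\delta}\big) + \frac{B}{\mathcal{C}} \log(n + B - 1)$. The
proposition is proved. □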
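
The counting step of the proof can be checked by brute force on a small instance. The sketch below
enumerates $Z$ for a hypothetical positive row and right-hand side, and compares $|Z|$ with the
number of combinations with repetitions and with the cruder bound $(n + B - 1)^B$. The enumeration
is exponential in $n$ and is meant for illustration only.

```python
from itertools import product
from math import comb

def count_Z(row, rhs):
    """Brute-force |{x in Z_+^n : row . x = rhs}| for a row of positive integers."""
    n = len(row)
    # Every coefficient is >= 1, so no component x_j can exceed rhs.
    return sum(
        1
        for x in product(range(rhs + 1), repeat=n)
        if sum(a * xj for a, xj in zip(row, x)) == rhs
    )

row, B = (1, 2, 3), 6                 # hypothetical positive row a^i and bound B = b_i
n = len(row)
size_Z = count_Z(row, B)
multichoose = comb(n + B - 1, B)      # combinations with repetitions of B items from n
crude = (n + B - 1) ** B              # bound used in Prop. 2.4
print(size_Z, multichoose, crude)
```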


Note that in Prop. 2.4 we can choose the JL random mapping $T$ as a matrix with $\{-1, +1\}$ entries
(Rademacher variables). In this case, there is no need to worry about floating point errors.
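
The remark about Rademacher entries can be illustrated as follows. This is a minimal sketch with
hypothetical integer data: the $\pm 1$ matrix is left unscaled (an assumption made here, since a
common scaling factor does not affect equality tests), so every projected quantity is an exact
Python integer.

```python
import random

def rademacher_matrix(k, m, rng):
    """k x m matrix with independent entries drawn uniformly from {-1, +1}."""
    return [[rng.choice((-1, 1)) for _ in range(m)] for _ in range(k)]

def project(T, v):
    """Exact integer matrix-vector product T v."""
    return [sum(t * c for t, c in zip(row, v)) for row in T]

rng = random.Random(2)
columns = [(3, 1, 4, 1), (5, 9, 2, 6), (5, 3, 5, 8)]   # hypothetical integer columns a'_j
x = (2, 0, 1)                                           # hypothetical integer multipliers
# b' = sum_j x_j a'_j, so the original system is feasible for this x.
b = tuple(sum(xj * col[i] for xj, col in zip(x, columns)) for i in range(4))

T = rademacher_matrix(2, 4, rng)
lhs = project(T, b)
projected_columns = [project(T, col) for col in columns]
rhs = [sum(xj * pc[i] for xj, pc in zip(x, projected_columns)) for i in range(2)]
print(lhs, rhs)
```

Because all operations are integer additions and multiplications, the equality test between the two
sides involves no rounding.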


3 Sets with low doubling dimension 


In this section, we denote by $B(x, r)$ the closed ball centered at $x$ with radius $r > 0$, and
$B_X(x, r) = B(x, r) \cap X$. We will also assume that $X$ is a doubling space, i.e. a set with
bounded doubling dimension. One example of a doubling space is a Euclidean space: for
$X = \mathbb{R}^m$, the doubling dimension $\log_2(\lambda_X)$ of $X$ can be shown to be a constant
factor of $m$ ([16], [6]). However, many sets of low doubling dimension are contained in high
dimensional spaces ([?]). Note that computing the doubling dimension of a metric space is generally
NP-hard ([5]). We shall make use of the following simple lemma.

3.1 Lemma 

For any $p \in X$ and $\varepsilon, r > 0$, there is a set $S \subseteq X$ of size at most
$\lambda_X^{\lceil \log_2(r/\varepsilon) \rceil}$ such that

$$B_X(p, r) \subseteq \bigcup_{s \in S} B(s, \varepsilon).$$

Proof. By definition of the doubling dimension, $B_X(p, r)$ is covered by at most $\lambda_X$ closed
balls of radius $r/2$. Each of these balls in turn is covered by $\lambda_X$ balls of radius $r/4$,
and so on: iteratively, for each $k \ge 1$, $B_X(p, r)$ is covered by $\lambda_X^k$ balls of radius
$r/2^k$. If we select $k = \lceil \log_2(r/\varepsilon) \rceil$ then $k \ge \log_2(r/\varepsilon)$,
i.e. $r/2^k \le \varepsilon$. This means $B_X(p, r)$ is covered by
$\lambda_X^{\lceil \log_2(r/\varepsilon) \rceil}$ balls of radius $\varepsilon$. □
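
The size bound of Lemma 3.1 is a simple function of the doubling constant and of the ratio
$r/\varepsilon$. The sketch below evaluates it; the function name is hypothetical, and the case
$r \le \varepsilon$ is handled by returning 1, on the assumption that a single ball centered at $p$
then suffices.

```python
import math

def cover_size_bound(doubling_constant, r, eps):
    """Upper bound lambda^ceil(log2(r / eps)) on the number of eps-balls in Lemma 3.1."""
    if r <= eps:
        return 1   # assumption: B(p, eps) alone already covers B_X(p, r)
    halvings = math.ceil(math.log2(r / eps))   # number of radius-halving rounds
    return doubling_constant ** halvings

print(cover_size_bound(4, 8.0, 1.0))   # three halvings: 8 -> 4 -> 2 -> 1
print(cover_size_bound(4, 5.0, 1.0))   # still three halvings, since 5 / 2^3 <= 1
```

The bound grows polynomially in $r/\varepsilon$, with exponent equal to the doubling dimension
$\log_2 \lambda_X$ (up to the rounding of the number of halvings).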

We will also use the following lemma, which is proved in [8] using a concentration estimation for
sums of squared Gaussian variables (Chi-squared distribution).

3.2 Lemma

Let $X \subseteq B(0, 1)$ be a subset of the $m$-dimensional Euclidean unit ball. Then there exist
universal constants $c, C > 0$ such that for $k \ge C \log \lambda_X + 1$ and $\delta > 1$, the
following holds:

$$\mathrm{Prob}(\exists x \in X \text{ s.t. } \|Tx\| > \delta) < e^{-ck\delta^2}.$$

In the proof of the next result (one of the main results in this section), we use the same idea as that 
in [8] for the nearest neighbor problem. 

3.3 Theorem 

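Given $p \notin X \subseteq \mathbb{R}^m$ with $0 < d < 1$, where $d = \min_{x \in X} \|p - x\|$ as
in Prop. 2.3, let $T : \mathbb{R}^m \to \mathbb{R}^k$ be a Gaussian random projection. Then

$$\mathrm{Prob}(T(p) \notin T(X)) = 1$$

if $k \ge C \log_2(\lambda_X)$, for some universal constant $C$.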

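Proof. Let $\varepsilon > 0$ and $0 = r_0 < r_1 < r_2 < \dots$ be scalars (their values will be
defined later). For each $j = 1, 2, 3, \dots$ we define a set

$$X_j = X \cap B(p, r_j) \setminus B(p, r_{j-1}).$$

Since $X_j \subseteq B_X(p, r_j)$, by Lemma 3.1 we can find a point set $S_j \subseteq X$ of size
$|S_j| \le \lambda_X^{\lceil \log_2(r_j/\varepsilon) \rceil}$ such that

$$X_j \subseteq \bigcup_{s \in S_j} B(s, \varepsilon).$$

Hence, for any $x \in X_j$, there is $s \in S_j$ such that $\|x - s\| \le \varepsilon$. Moreover, by
the triangle inequality, any such $s$ satisfies
$r_{j-1} - \varepsilon < \|s - p\| \le r_j + \varepsilon$, so without loss of generality we can
assume that

$$S_j \subseteq B(p, r_j + \varepsilon) \setminus B(p, r_{j-1} - \varepsilon).$$

We denote by $\mathcal{E}_j$ the event that:

$$\exists s \in S_j, \exists x \in X_j \cap B(s, \varepsilon) \text{ s.t. } \|Ts - Tx\| > \varepsilon\sqrt{j}.$$

By the union bound, we have

$$\mathrm{Prob}(\mathcal{E}_j) \le \sum_{s \in S_j} \mathrm{Prob}(\exists x \in X_j \cap B(s, \varepsilon) \text{ s.t. } \|Ts - Tx\| > \varepsilon\sqrt{j}) \le \sum_{s \in S_j} e^{-c_1 k j} = |S_j| e^{-c_1 k j}$$

(for some universal constant $c_1$, by Lemma 3.2).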
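Moreover,

$$\mathrm{Prob}(\exists x \in X \text{ s.t. } T(x) = T(p)) = \mathrm{Prob}\Big(\exists x \in \bigcup_{j=1}^{\infty} X_j \text{ s.t. } T(x) = T(p)\Big) \le \sum_{j=1}^{\infty} \mathrm{Prob}(\exists x \in X_j \text{ s.t. } T(x) = T(p)).$$

Now we will estimate the individual probabilities, writing $\overline{\mathcal{E}_j}$ for the
complement of the event $\mathcal{E}_j$ defined above:

$$\begin{aligned}
\mathrm{Prob}(\exists x \in X_j \text{ s.t. } T(x) = T(p))
&\le \mathrm{Prob}\big((\exists x \in X_j \text{ s.t. } T(x) = T(p)) \wedge \overline{\mathcal{E}_j}\big) + \mathrm{Prob}(\mathcal{E}_j) \\
&\le \mathrm{Prob}(\exists x \in X_j, s \in S_j \cap B(x, \varepsilon) \text{ s.t. } T(x) = T(p) \wedge \|T(s) - T(x)\| \le \varepsilon\sqrt{j}) + \mathrm{Prob}(\mathcal{E}_j) \\
&\le \mathrm{Prob}(\exists s \in S_j \text{ s.t. } \|T(s) - T(p)\| \le \varepsilon\sqrt{j}) + |S_j| e^{-c_1 k j}.
\end{aligned}$$

Next, we choose $\varepsilon = d/N$ for some large $N$; and for each $j \ge 1$, we choose
$r_j = (2 + j)\varepsilon$. For $j < N - 2$, by definition it follows that $X_j = \emptyset$.
Therefore

$$\mathrm{Prob}(\exists x \in X_j \text{ s.t. } T(x) = T(p)) = 0.$$

On the other hand, for $j \ge N - 2$,

$$\begin{aligned}
\mathrm{Prob}(\exists s \in S_j \text{ s.t. } \|T(s) - T(p)\| \le \varepsilon\sqrt{j})
&\le \sum_{s \in S_j} \mathrm{Prob}\Big(\|T(z)\| \le \frac{\varepsilon\sqrt{j}}{r_{j-1} - \varepsilon}\Big) \quad \text{for an arbitrary unit vector } z \in \mathbb{R}^m \\
&= |S_j| \, \mathrm{Prob}\Big(\|T(z)\| \le \frac{1}{\sqrt{j}}\Big) \quad \text{for an arbitrary unit vector } z \in \mathbb{R}^m \\
&\le \lambda_X^{\lceil \log_2(3 + j) \rceil} j^{-k/2} \quad \text{by the estimation (4).}
\end{aligned}$$

Note that
$\lambda_X^{\lceil \log_2(3 + j) \rceil} \le \lambda_X^{\log_2(6 + 2j)} = (6 + 2j)^{\log_2 \lambda_X} \le j^{2 \log_2 \lambda_X}$
for large enough $N$. Therefore, we have

$$\mathrm{Prob}(\exists x \in X_j \text{ s.t. } T(x) = T(p)) \le \lambda_X^{\lceil \log_2(3 + j) \rceil} \big(j^{-k/2} + e^{-c_1 k j}\big) \le j^{-c_2 k} + e^{-c_3 k j}$$

for some universal constants $c_2, c_3$, provided that $k \ge C_1 \log \lambda_X$ for some large
enough constant $C_1$. Finally, by the union bound,

$$\mathrm{Prob}(T(p) \notin T(X)) = 1 - \mathrm{Prob}(T(p) \in T(X)) \ge 1 - \sum_{j=N-2}^{\infty} \big(j^{-c_2 k} + e^{-c_3 k j}\big),$$

which tends to 1 when $N$ tends to infinity. □

Our final result in the section is an extension of Thm. 3.3 to the threshold case.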


Our final result in the section is an extension of Thm. 13.dl to the threshold case. 
3.4 Theorem

Let $p \notin X \subseteq \mathbb{R}^m$, $T : \mathbb{R}^m \to \mathbb{R}^k$ be a Gaussian random
projection, and $d = \min_{x \in X} \|p - x\|$. Then for all $0 < \delta < 1$ and all
$0 < \tau < \kappa d$ for some constant $\kappa < 1$, we have

$$\mathrm{Prob}(\mathrm{dist}(T(p), T(X)) > \tau) > 1 - \delta$$

if $k$ is $O\Big(\frac{\log(\lambda_X/\delta)}{\log(d/\tau)}\Big)$.


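Proof. For $j = 1, 2, \dots$ we construct the sets $X_j, S_j$ similarly to those in the proof of
Thm. 3.3 (where the values of $r_j$ and $\varepsilon$ will be defined later). Then we have

$$\mathrm{Prob}(\exists x \in X \text{ s.t. } \|T(x) - T(p)\| \le \tau) = \mathrm{Prob}\Big(\exists x \in \bigcup_{j=1}^{\infty} X_j \text{ s.t. } \|T(x) - T(p)\| \le \tau\Big) \le \sum_{j=1}^{\infty} \mathrm{Prob}(\exists x \in X_j \text{ s.t. } \|T(x) - T(p)\| \le \tau).$$

For all $j \ge 1$, with $\mathcal{E}_j$ the event defined in the proof of Thm. 3.3 and
$\overline{\mathcal{E}_j}$ its complement, we have

$$\begin{aligned}
\mathrm{Prob}(\exists x \in X_j \text{ s.t. } \|T(x) - T(p)\| \le \tau)
&\le \mathrm{Prob}\big((\exists x \in X_j \text{ s.t. } \|T(x) - T(p)\| \le \tau) \wedge \overline{\mathcal{E}_j}\big) + \mathrm{Prob}(\mathcal{E}_j) \\
&\le \mathrm{Prob}(\exists x \in X_j, s \in S_j \cap B(x, \varepsilon) \text{ s.t. } \|T(x) - T(p)\| \le \tau \wedge \|T(s) - T(x)\| \le \varepsilon\sqrt{j}) + \mathrm{Prob}(\mathcal{E}_j) \\
&\le \mathrm{Prob}(\exists s \in S_j \text{ s.t. } \|T(s) - T(p)\| \le \tau + \varepsilon\sqrt{j}) + |S_j| e^{-c_1 k j}.
\end{aligned}$$

Now we choose $\varepsilon = \tau/N$ for some $N > 0$ such that $1 + \frac{1}{N} \le [\ldots]$, and for
each $j \ge 1$, we choose $r_j = \tau\sqrt{j + 1} + (2 + j)\varepsilon$. For $j = 1$, by the union
bound we have

$$\begin{aligned}
\mathrm{Prob}(\exists s \in S_1 \text{ s.t. } \|T(s) - T(p)\| \le \tau + \varepsilon\sqrt{1})
&\le \sum_{s \in S_1} \mathrm{Prob}\Big(\|T(z)\| \le \frac{\tau + \varepsilon}{d}\Big) \quad \text{for an arbitrary unit vector } z \in \mathbb{R}^m \\
&\le \lambda_X^{\lceil \log_2(4 + N\sqrt{2}) \rceil} \, \mathrm{Prob}\Big(\|T(z)\| \le \Big(1 + \frac{1}{N}\Big)\frac{\tau}{d}\Big) \quad \text{for an arbitrary unit vector } z \in \mathbb{R}^m \\
&< \lambda_X^{\lceil \log_2(4 + N\sqrt{2}) \rceil} \Big(3\Big(1 + \frac{1}{N}\Big)^2 \frac{\tau^2}{d^2}\Big)^{k/2} \quad \text{by estimation (4)} \\
&\le \Big(\frac{\tau}{d}\Big)^{c_2 k}
\end{aligned} \tag{6}$$

for some universal constant $c_2 > 0$, as long as $k \ge C \log(\lambda_X)$ for some $C$ large enough.

For $j \ge 2$, note that
$r_{j-1} - \varepsilon = \tau\sqrt{j} + j\varepsilon = \sqrt{j}\,(\tau + \varepsilon\sqrt{j})$, so we
have

$$\begin{aligned}
\mathrm{Prob}(\exists s \in S_j \text{ s.t. } \|T(s) - T(p)\| \le \tau + \varepsilon\sqrt{j})
&\le \sum_{s \in S_j} \mathrm{Prob}\Big(\|T(z)\| \le \frac{\tau + \varepsilon\sqrt{j}}{r_{j-1} - \varepsilon}\Big) \quad \text{for an arbitrary unit vector } z \in \mathbb{R}^m \\
&\le \lambda_X^{\lceil \log_2(3 + j + N\sqrt{j + 1}) \rceil} \, \mathrm{Prob}\Big(\|T(z)\| \le \frac{1}{\sqrt{j}}\Big) \quad \text{for an arbitrary unit vector } z \in \mathbb{R}^m \\
&\le \lambda_X^{\lceil \log_2(3 + j + N\sqrt{j + 1}) \rceil} j^{-k/2} \quad \text{by estimation (4)} \\
&\le j^{-c_3 k}
\end{aligned} \tag{7}$$

for some universal constant $c_3 > 0$, as long as $k \ge C \log(\lambda_X)$ for some $C$ large enough.
Similarly, for all $j \ge 1$, we have

$$\lambda_X^{\lceil \log_2(r_j/\varepsilon) \rceil} e^{-c_1 k j} \le e^{-c_4 k j} \tag{8}$$

for some universal constant $c_4 > 0$, as long as $k \ge C \log(\lambda_X)$ for some $C$ large enough.
From estimations (6), (7), (8) and by the union bound we have:

$$\begin{aligned}
\mathrm{Prob}(\mathrm{dist}(T(p), T(X)) > \tau)
&\ge 1 - \sum_{j=1}^{\infty} \mathrm{Prob}(\mathrm{dist}(T(p), T(X_j)) \le \tau) \\
&\ge 1 - \Big(\frac{\tau}{d}\Big)^{c_2 k} - \sum_{j=2}^{\infty} j^{-c_3 k} - \sum_{j=1}^{\infty} e^{-c_4 k j} \\
&\ge 1 - \delta
\end{aligned}$$

for $k = O\Big(\frac{\log(\lambda_X/\delta)}{\log(d/\tau)}\Big)$ large enough. □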
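
The radii $r_j = \tau\sqrt{j + 1} + (2 + j)\varepsilon$ in the proof are chosen so that the
normalized threshold $(\tau + \varepsilon\sqrt{j})/(r_{j-1} - \varepsilon)$ collapses to
$1/\sqrt{j}$ for $j \ge 2$. The sketch below checks this identity numerically; the function names
and the sample values of $\tau$ and $N$ are hypothetical.

```python
import math

def radius(j, tau, eps):
    """r_j = tau * sqrt(j + 1) + (2 + j) * eps, with r_0 = 0 by convention."""
    return 0.0 if j == 0 else tau * math.sqrt(j + 1) + (2 + j) * eps

def normalized_threshold(j, tau, eps):
    """(tau + eps * sqrt(j)) / (r_{j-1} - eps), defined here for j >= 2."""
    return (tau + eps * math.sqrt(j)) / (radius(j - 1, tau, eps) - eps)

tau, N = 0.3, 7          # sample values; eps = tau / N as in the proof
eps = tau / N
for j in (2, 5, 10):
    print(j, normalized_threshold(j, tau, eps), 1.0 / math.sqrt(j))
```

This is why the estimate for shell $j$ reduces to a bound on
$\mathrm{Prob}(\|T(z)\| \le 1/\sqrt{j})$, which decays polynomially in $j$ with exponent
proportional to $k$.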